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We describe in detail a new and highly efficient algorithm for studying site or bond percolation 
on any lattice. The algorithm can measure an observable quantity in a percolation system for all 
values of the site or bond occupation probability from zero to one in an amount of time which scales 
linearly with the size of the system. We demonstrate our algorithm by using it to investigate a 
number of issues in percolation theory, including the position of the percolation transition for site 
percolation on the square lattice, the stretched exponential behavior of spanning probabilities away 
from the critical point, and the size of the giant component for site percolation on random graphs. 
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I. INTRODUCTION 

Percolation Q is certainly one of the best studied prob- 
lems in statistical physics. Its statement is trivially sim- 
ple: in site percolation every site on a specified lattice is 
independently either "occupied," with probability p, or 
not with probability I — p. The occupied sites form con- 
tiguous clusters which have some interesting properties. 
In particular, the system shows a continuous phase tran- 
sition at a finite value of p which, on a regular lattice, is 
characterized by the formation of a cluster large enough 
to span the entire system from one side to the other in 
the limit of infinite system size. We say such a system 
"percolates." As the phase transition is approached from 
small values of p, the average cluster size diverges in a 
way reminiscent of the divergence of fluctuations in the 
approach to a thermal continuous phase transition, and 
indeed one can define correlation functions and a correla- 
tion length in the obvious fashion for percolation models, 
and hence measure critical exponents for the transition. 

One can also consider bond percolation in which the 
bonds of the lattice are occupied (or not) with probability 
p (or 1—p), and this system shows behavior qualitatively 
similar to though different in some details from site per- 
colation. 

Site and bond percolation have found a huge variety of 
uses in many fields. Percolation models appeared orig- 
inally in studies of actual percolation in materials ffl- 
percolation of fluids through rock for example — but 
have since been used in studies of many other systems, in- 
cluding granular materials (||| 7 composite materials , 
polymers concrete 0, aerogel and other porous me- 
dia ficj Jllf] , and many others. Percolation also finds uses 
outside of physics, where it has been employed to model 
resistor networks [O, forest fires |f|] and other ecolog- 
ical disturbances E3], epidemic s Ea£Q| , robustness of 
the Internet and other networks Jl7| , [18| , biological evolu- 
tion p9[ , and social influence [20|, amongst other things. 
It is one of the simplest and best understood examples 
of a phase transition in any system, and yet there are 
many things about it that are still not known. For ex- 
ample, despite decades of effort, no exact solution of the 



site percolation problem yet exists on the simplest two- 
dimensional lattice, the square lattice, and no exact re- 
sults are known on any lattice in three dimensions or 
above. Because of these and many other gaps in our cur- 
rent understanding of percolation, numerical simulations 
have found wide use in the field. 

Computer studies of percolation are simpler than most 
simulations in statistical physics, since no Markov pro- 
cess or other importance sampling mechanism is needed 
to generate states of the lattice with the correct prob- 
ability distribution. We can generate states simply by 
occupying each site or bond on an initially empty lattice 
with independent probability p. Typically, we then want 
to find all contiguous clusters of occupied sites or bonds, 
which can be done using a simple depth- or breadth-first 
search. Once we have found all the clusters, we can easily 
calculate, for example, average cluster size, or look for a 
system spanning cluster. Both depth- and breadth-first 
searches take time O(M) to find all clusters, where M 
is the number of bonds on the lattice. This is optimal, 
since one cannot check the status of M individual bonds 
or adjacent pairs of sites in any less than O(M) opera- 
tions pi] ]. On a regular lattice, for which M = \zN , 
where z is the coordination number and N is the number 
of sites, O(M) is also equivalent to O(N). 

But now consider what happens if, as is often the case, 
we want to calculate the values of some observable quan- 
tity Q (e.g., average cluster size) over a range of values 
of p. Now we have to perform repeated simulations at 
many different closely-spaced values of p in the range of 
interest, which makes the calculation much slower. Fur- 
thermore, if we want a continuous curve of Q{p) in the 
range of interest then in theory we have to measure Q 
at an infinite number of values of p. More practically, 
we can measure it at a finite number of values and then 
interpolate between them, but this inevitably introduces 
some error into the results. 

The latter problem can be solved easily enough. The 
trick [|22|]23| 1 is to measure Q for fixed numbers of occu- 
pied sites (or bonds) n in the range of interest. Let us 
refer to the ensemble of states of a percolation system 
with exactly n occupied sites or bonds as a "microcanon- 
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ical percolation ensemble," the number n playing the role 
of the energy in thermal statistical mechanics. The more 
normal case in which only the occupation probability p 
is fixed is then the "canonical ensemble" for the problem. 
(If one imagines the occupied sites or bonds as represent- 
ing particles instead, then the two ensembles would be 
"canonical" and "grand canonical," respectively. Some 
authors have used this nomenclature.) Taking site 
percolation as an example, the probability of there being 
exactly n occupied sites on the lattice for a canonical per- 
colation ensemble is given by the binomial distribution: 
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(The same expression applies for bond percolation, but 
with N replaced by M, the total number of bonds.) 
Thus, if we can measure our observable within the mi- 
crocanonical ensemble for all values of n, giving a set 
of measurements {Q n }, then the value in the canonical 
ensemble will be given by 
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Thus we need only measure Q„ for all values of n, of 
which there are 2V+1, in order to find Q(p) for all p. Since 
filling the lattice takes time O(N), and construction of 
clusters O(M), we take time 0(M + N) for each value of 
n, and 0(N 2 + MN) for the entire calculation, or more 
simply 0(iV 2 ) on a regular lattice. Similar considerations 
apply for bond percolation. 

In fact, one frequently does not want to know the value 
of Q(p) over the entire range < p < 1, but only in the 
critical region, i.e., the region in which the correlation 
length £ is greater than the linear dimension L of the 
system. If £ ~ \p — p c \~ u close to p c , where v is a (con- 
stant) critical exponent, then the width of the critical 
region scales as L~ x l v . Since the binomial distribution, 
Eq. ([!]), falls off very fast away from its maximum (ap- 
proximately as a Gaussian with variance of order iV -1 ), 
it is safe also only to calculate Q n within a region scal- 
ing as L" 1 /", and there are order L d ^ 1 l v values of N in 
such a region, where d is the dimensionality of the lat- 
tice. Thus the calculation of Q(p) in the critical region 
will take time of order NL d ~ x l u = N 2 - x / du . In two di- 
mensions, for example, v is known to take the value |, 
and so we can calculate any observable quantity over the 
whole critical region in time 0(N 13 / S ). 

An alternative technique for calculating Q(p) over a 
range of values of p is to use a histogram method. (This 
should not be confused with the method of Hu , which 
is also referred to as a "histogram method," but which 
is different from the method described here. Our his- 
togram method is the equivalent for a percolation model 
of the method of Ferrenberg and Swendsen |^5| for ther- 
mal models.) Suppose that one performs a number M of 



simulations of a percolation system, each one generating 
a single state \x of the system drawn from some prob- 
ability distribution P^. In each state we measure our 
observable of interest Q, giving a set of measurements 
{Qfj,}- Then our estimate of the value of Q(p) is given by 
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where is the number of occupied sites or bonds 
in state fx. In normal (canonical) simulations of per- 
colation Pp. is just equal to the binomial distribution 
B(N , n M ,p), which then cancels out of Eq. (|j|) to give 
Q(p) — jj ^2fi m the usual fashion. In microcanon- 
ical simulations is uniform over values of n, and 
Eq. (||) is equivalent to Eq. (||). Suppose however that 
instead P M corresponds to the distribution of states pro- 
duced by a canonical simulation at a different value po 
of the site or bond occupation probability. In this case 
Py, = B(N,n tJl ,po), and Eq. (||) becomes 
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This equation tells us that if we perform a simulation 
or set of simulations at occupation probability po, an d 
record the value(s) of and for each state generated, 
we can use the results to estimate Q{p) at any other 
value of p. In practice, the range over which we can 
reliably extrapolate away from po using this method is 
limited by the range of values of n sampled; if the values 
of n sampled at po make negligible contributions to the 
canonical ensemble of states at p, then extrapolation will 
give poor results. Since the values are drawn from 
a binomial distribution with width yj Npo(l — po), the 
entire range of n can be sampled with 0(N 1 ^ 2 ) separate 
simulations at different values of po- Each simulation 
takes time O(N) to complete, and hence it takes time 
0(iV 3 / 2 ) to calculate Q(p) for all values of p. If we are 
only interested in the critical region, this time is reduced 
to 0(iV 3/2 - 1/d,y ), which is 0(7V 9/8 ) in two dimensions. 
These times are considerably faster than those for the 
direct method described above of performing simulations 
for all values of n, but this speed is offset by the fact 
that the histogram method gives larger statistical errors 
on measured quantities than the direct method p6| , p7f . 

In this paper we describe a new algorithm which can 
perform the calculation of Q(p) for all values of p faster 
than any of the algorithms above, in time 0(7V). This is 
clearly far superior to the 0(./V 2 ) direct algorithm while 
having comparable accuracy in terms of statistical error. 
It is also substantially faster than the 0(7V 3 / 2 ) histogram 
method while having significantly better statistical accu- 
racy. And even in the case where we only want to calcu- 
late Q in the critical region, our algorithm is still faster 
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doing all values of n, than previous algorithms doing only 
those in the critical region. (Because of the way it works, 
our algorithm cannot be used to calculate Q n only for 
the critical region; one is obliged to calculate O(N) val- 
ues of Q n -) A typical lattice size for percolation systems 
is about a million sites. On such a lattice our algorithm 
would be on the order of a million times faster than the 
simple 0(iV 2 ) method and a thousand times faster than 
the histogram method, and even just within the critical 
region it would still be around six times faster than the 
histogram method in two dimensions, while giving sub- 
stantially better statistical accuracy. 

The outline of this paper is as follows. In Section |l| 
we describe our algorithm. In Section III we give results 
from the application of the algorithm to three different 
example problems. In Section ^ we give our conclu- 
sions. We have reported some of the results given here 
previously in Ref. p8j|. 



II. THE ALGORITHM 

Our algorithm is based on a very simple idea. In the 
standard algorithms for percolation, one must create an 
entire new state of the lattice for every different value of 
n one wants to investigate, and construct the clusters for 
that state. As various authors have pointed out, how- 
ever |^2|,^3|,^9|,|o) , if we want to generate states for each 
value of n from zero up to some maximum value, then 
we can save ourselves some effort by noticing that a cor- 
rect sample state with n + 1 occupied sites or bonds is 
given by adding one extra randomly chosen site or bond 
to a correct sample state with n sites or bonds. In other 
words, we can create an entire set of correct percolation 
states by adding sites or bonds one by one to the lattice, 
starting with an empty lattice. 

Furthermore, the configuration of clusters on the lat- 
tice changes little when only a single site or bond is 
added, so that, with a little ingenuity, one can calculate 
the new clusters from the old with only a small amount 
of computational effort. This is the idea at the heart of 
our algorithm. 

Let us consider the case of bond percolation first, for 
which our algorithm is slightly simpler. We start with a 
lattice in which all M bonds are unoccupied, so that each 
of the N sites is its own cluster with just one element. 
As bonds are added to the lattice, these clusters will be 
joined together into larger ones. The first step in our 
algorithm is to decide an order in which the bonds will 
be occupied. We wish to choose this order uniformly at 
random from all possible such orders, i.e., we wish to 
choose a random permutation of the bonds. One way of 
achieving this is the following: 

1. Create a list of all the bonds in any convenient 
order. Positions in this list are numbered from 1 
to M. 
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FIG. 1. Two examples of bonds (dotted lines) being added 
to bond-percolation configurations, (a) The added bond joins 
two clusters together, (b) The added bond does nothing, since 
the two joined sites were already members of the same cluster. 
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3. Choose a number j uniformly at random in the 
range i < j < M. 

4. Exchange the bonds in positions i and j. (If i = j 
then nothing happens.) 

5. Set i <- i + 1. 

6. Repeat from step 3 until i = M. 

It is straightforward to convince oneself that all permu- 
tations of the bonds are generated by this procedure with 
equal probability, in time O(M), i.e., going linearly with 
the number of bonds on the lattice. 

Having chosen an order for our bonds, we start to oc- 
cupy them in that order. The first bond added will, 
clearly, join together two of our single-site clusters to 
form a cluster of two sites, as will, almost certainly, the 
second and third. However, not all bonds added will join 
together two clusters, since some bonds will join pairs 
of sites which are already part of the same cluster — see 
Fig. [|. Thus, in order to correctly keep track of the clus- 
ter configuration of the lattice, we must do two things 
for each bond added: 

Find: when a bond is added to the lattice we must 
find which clusters the sites at either end belong to. 

Union: if the two sites belong to different clusters, 
those clusters must be amalgamated into a single cluster; 
otherwise, if the two belong to the same cluster, we need 
do nothing. 

Algorithms which achieve these steps are known as 
"union/find" algorithms [ pl|j32| . Union/find algorithms 
are widely used in data structures, in calculations on 
graphs and trees, and in compilers. They have been ex- 
tensively studied in computer science, and we can make 
profitable use of a number of results from the computer 
science literature to implement our percolation algorithm 
simply and efficiently. 

It is worth noting that measured values for lattice 
quantities such as, say, average cluster size, are not sta- 
tistically independent in our method, since the configu- 
ration of the lattice changes at only one site from one 
step of the algorithm to the next. This contrasts with 
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fraction of occupied bonds 



size of system N 



FIG. 2. Number of relabelings of sites performed, per bond 
added, as a function of fraction of occu pied bonds, for the 
two algorithms described in Section 
of 32 x 32 sites. 
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on a square lattice 



the standard algorithms, in which a complete new con- 
figuration is generated for every value of n or p investi- 
gated, and hence all data points are statistically indepen- 
dent. While this is in some respects a disadvantage of our 
method, it turns out that in most cases of interest it is 
not a problem, since almost always one is concerned only 
with statistical independence of the results from one run 
to another. This our algorithm clearly has, which means 
that an error estimate on the results made by calculating 
the variance over many runs will be a correct error esti- 
mate, even though the errors on the observed quantity 
for successive values of n or p will not be independent. 

In the next two sections we apply a number of dif- 
ferent and increasingly efficient union / find algorithms to 
the percolation problem, culminating with a beautifully 
simple and almost linear algorithm associated with the 
names of Michael Fischer and Robert Tarjan. 



A. Trivial (but possibly quite efficient) 
union/find algorithms 

Perhaps the simplest union/find algorithm which we 
can employ for our percolation problem is the following. 
We add a label to each site of our lattice — an integer for 
example — which tells us to which cluster that site be- 
longs. Initially, all such labels are different (e.g., they 
could be set equal to their site label). Now the "find" 
portion of the algorithm is simple — we examine the la- 
bels of the sites at each end of an added bond to see if 
they are the same. If they are not, then the sites be- 
long to different clusters, which must be amalgamated 
into one as a result of the addition of this bond. The 
amalgamation, the "union" portion of the algorithm, is 



FIG. 3. Total number of relabelings of sites performed in a 
single run as a function of syste m siz e for the two algorithms 
(circles and squares) of Section [I A. Each data point is an 
average over 1000 runs; error bars are much smaller than the 
points themselves. The solid lines are straight-line fits to the 
rightmost five points in each case. 



more involved. To amalgamate two clusters we have to 
choose one of them — in the simplest case we just choose 
one at random — and set the cluster labels of all sites in 
that cluster equal to the cluster label of the other cluster. 

In the initial stages of the algorithm, when most bonds 
are unoccupied, this amalgamation step will be quick and 
simple; most bonds added will amalgamate clusters, but 
the clusters will be small and only a few sites will have 
to be relabeled. In the late stages of the algorithm, when 
most bonds are occupied, all or most sites will belong 
to the system-size percolating cluster, and hence cluster 
unions will rarely be needed, and again the algorithm 
is quick. Only in the intermediate regime, close to the 
percolation point, will any significant amount of work be 
required. In this region, there will in general be many 
large clusters, and much relabeling may have to be per- 
formed when two clusters are amalgamated. Thus, the 
algorithm displays a form of critical slowing down as it 
passes through the critical region. We illustrate this in 
Fig. U (solid line) , where we show the number of site rela- 
belings taking place, as a function of fraction of occupied 
bonds, for bond percolation on a square lattice of 32 x 32 
sites. The results are averaged over a million runs of the 
algorithm, and show a clear peak in the number of rela- 
belings in the region of the known percolation transition 
at bond occupation probability p c = h ■ 

In Fig. H (circles) we show the average total number of 
relabelings which are carried out during the entire run of 
the algorithm on square lattices of N = L x L sites as a 
function of N. Given that, as the number of relabelings 
becomes large, the time taken to relabel will be the dom- 
inant factor in the speed of this algorithm, this graph 
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gives an indication of the expected scaling of run-time 
with system size. As we can see, the results lie approxi- 
mately on a straight line on the logarithmic scales used 
in the plot and a fit to this line gives a run-time which 
scales as N a , with a=1.91±0.01. This is slightly better 
than the 0(N 2 ) behavior of the standard algorithms, and 
in practice can make a significant difference to running 
time on large lattices. With a little ingenuity however, 
we can do a lot better. 

The algorithm's efficiency can be improved consider- 
ably by one minor modification: we maintain a separate 
list of the sizes of the clusters, indexed by their cluster 
label, and when two clusters are amalgamated, instead 
of relabeling one of them at random, we look up their 
sizes in this list and then relabel the smaller of the two. 
This "weighted union" strategy ensures that we carry 
out the minimum possible number of relabelings, and in 
particular avoids repeated relabelings of the large perco- 
lating cluster when we are above the percolation transi- 
tion, which is a costly operation. The average number of 
relabelings performed in this algorithm as a function of 
number of occupied bonds is shown as the dotted line in 
Fig. ||, and it is clear that this change in the algorithm 
has saved us a great deal of work. In Fig. ^ (squares) 
we show the total number of relabelings as a function 
of system size, and again performance is clearly much 
better. The run-time of the algorithm now appears to 
scale as iV a with a =1.06 ±0.01. In fact, we can prove 
that the worst-case running time goes as NlogN, a form 
which frequently manifests itself as an apparent power 
law with exponent slightly above 1. To see this con- 
sider that with weighted union the size of the cluster to 
which a site belongs must at least double every time that 
site is relabeled. The maximum number of such dou- 
blings is limited by the size of the lattice to log 2 N, and 
hence the maximum number of relabelings of a site has 
the same limit. An upper bound on the total number of 
relabelings performed for all N sites during the course 
of the algorithm is therefore N log 2 N. (A similar ar- 
gument applied to the unweighted relabeling algorithm 
implies that N 2 is an upper bound on the running time 
of this algorithm. The measured exponent 1.91 indicates 
either that the experiments we have performed have not 
reached the asymptotic scaling regime, or else that the 
worst-case running time is not realized for the particular 
case of union of percolation clusters on a square lattice.) 

An algorithm whose running time scales as 0(N log N) 
is a huge advance over the standard methods. However, 
it turns out that, by making use of some known tech- 
niques and results from computer science, we can make 
our algorithm better still while at the same time actu- 
ally making it simpler. This delightful development is 
described in the next section. 




FIG. 4. An example of the tree structure described in the 
text, here representing two clusters. The shaded sites are the 
root sites and the arrows represent pointers. When a bond 
is added (dotted line in center) that joins sites which belong 
to different clusters (i.e., whose pointers lead to different root 
sites), the two clusters must be amalgamated by making one 
(left) a subtree of the other (right). This is achieved by adding 
a new pointer from the root of one tree to the root of the other. 



B. Tree-based union/find algorithms 

Almost all modern union/find algorithms make use of 
data trees to store the sets of objects (in this case clus- 
ters) which are to be searched. The idea of using a tree 
in this way seems to have been suggested first by Galler 
and Fischer p3| . Each cluster is stored as a separate tree, 
with each vertex of the tree being a separate site in the 
cluster. Each cluster has a single "root" site, which is the 
root of the corresponding tree, and all other sites possess 
pointers either to that root site or to another site in the 
cluster, such that by following a succession of such point- 
ers we can get from any site to the root. By traversing 
trees in this way, it is simple to ascertain whether two 
sites are members of the same cluster: if their pointers 
lead to the same root site then they are, otherwise they 
are not. This scheme is illustrated for the case of bond 
percolation on the square lattice in Fig. |[ 

The union operation is also simple for clusters stored as 
trees: two clusters can be amalgamated simply by adding 
a pointer from the root of one to any site in the other 
(Fig. ||). Normally, one chooses the new pointer to point 
to the root of the other cluster, since this reduces the 
average distance that will be have to be traversed across 
the tree to reach the root site from other sites. 

There are many varieties of tree-based union/find al- 
gorithms, which differ from one another in the details of 
how the trees are updated as the algorithm progresses. 
However, the best known performance for any such algo- 
rithm is for a very simple one — the "weighted union/find 
with path compression" — which was first proposed by 
Fischer |3^j3^,Q. Its description is brief: 

Find: In the find part of the algorithm, trees are tra- 
versed to find their root sites. If two initial sites lead 
to the same root, then they belong to the same cluster. 
In addition, after the traversal is completed, all pointers 
along the path traversed are changed to point directly to 
the root of their tree. This is called "path compression." 
Path compression makes traversal faster the next time 
we perform a find operation on any of the same sites. 
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Union: In the union part of the algorithm, two clus- 
ters are amalgamated by adding a pointer from the root 
of one to the root of the other, thus making one a subtree 
of the other. We do this in a "weighted" fashion as in 



implem entat ion of the algorithm is discussed in detail in 
Section II D and in Appendix A. 



Section [I A, meaning that we always make the smaller 
of the two a subtree of the larger. In order to do this, we 
keep a record at the root site of each tree of the number 
of sites in the corresponding cluster. When two clusters 
are amalgamated the size of the new composite cluster is 
the sum of the sizes of the two from which it was formed. 

Thus our complete percolation algorithm can be sum- 
marized as follows. 

1. Initially all sites are clusters in their own right. 
Each is its own root site, and contains a record 
of its own size, which is 1. 

2. Bonds are occupied in random order on the lattice. 

3. Each bond added joins together two sites. We fol- 
low pointers from each of these sites separately un- 
til we reach the root sites of the clusters to which 
they belong. Root sites are identified by the fact 
that they do not possess pointers to any other 
sites. Then we go back along the paths we fol- 
lowed through each tree and adjust all pointers 
along those paths to point directly to the corre- 
sponding root sites. 

4. If the two root sites are the same site, we need do 
nothing further. 

5. If the two root sites are different, we examine the 
cluster sizes stored in them, and add a pointer from 
the root of the smaller cluster to the root of the 
larger, thereby making the smaller tree a subtree 
of the larger one. If the two are the same size, we 
may choose whichever tree we like to be the subtree 
of the other. We also update the size of the larger 
cluster by adding the size of the smaller one to it. 

These steps are repeated until all bonds on the lattice 
have been occupied. At each step during the run, the 
tree structures on the lattice correctly describe all of the 
clusters of joined sites, allowing us to evaluate observable 
quantities of interest. For example, if we are interested in 
the size of the largest cluster on the lattice as a function 
of the number of occupied bonds, we simply keep track 
of the largest cluster size we have seen during the course 
of the algorithm. 

Although this algorithm may see m m ore involved than 
the relabeling algorithm of Section II A , it turns out that 
its implementation is actually simpler, for two reasons. 
First, there is no relabeling of sites, which eliminates the 
need for the code portion which tracks down all the sites 
belonging to a cluster and updates them. Second, the 
only difficult parts of the algorithm, the tree traversal 
and path compression, can it turns out be accomplished 
together by a single function which, by artful use of re- 
cursion, can be coded in only three lines (two in C). The 



C. Performance of the algorithm 

Without cither weighting or path compression each 
step of the algorithm above is known to take time lin- 
ear in the tree size p4| , while with either weighting or 
path compression alone it takes logarithmic time p4] , p5| . 
When both are used together, however, each step takes 
an amount of time which is very nearly constant, in a 
sense which we now explain. 

The worst-case performance of the weighted union/find 
algorithm with path compression has been analyzed by 
Tarjan ]3q ] in the general case in which one starts with 
n individual single-element sets and performs all n — 1 
unions (each one reducing the number of sets by 1 from 
n down to 1), and any number to > n of intermediate 
finds. The finds are performed on any sequence of items, 
and the unions may be performed in any order. Our case 
is more restricted. We always perform exactly 2M finds, 
and only certain unions and certain finds may be per- 
formed. For example, we only ever perform finds on pairs 
of sites which are adjacent on the lattice. And unions can 
only be performed on clusters which are adjacent to one 
another along at least one bond. However, it is clear that 
the worst-case performance of the algorithm in the most 
general case also provides a bound on the performance 
of the algorithm in any more restricted case, and hence 
Tarjan's result applies to our algorithm. 

Tarjan's analysis is long and complicated. However, 
the end result is moderately straightforward. The union 
part of the algorithm clearly takes constant time. The 
find part takes time which scales as the number of steps 
taken to traverse the tree to its root. Tarjan showed 
that the average number of steps is bounded above by 
ka(m, n), where k is an unknown constant and the func- 
tion a(rn,n) is the smallest integer value of z greater 
than or equal to 1, such that A(z,A\m/n\) > log 2 n. 
The function A(i,j) in this expression is a slight variant 
on Ackermann's function fl36| defined by 



A(i,Q) = 
A(i,l) = 2 

A(i,j) = A{i-l,A{i,j-l)) 



for all j 
for i > 1 
for i > 1 

for i > 1, j > 2. (5) 



This function is a monstrously quickly growing function 
of its arguments — faster than any multiple exponential — 
which means that a(m, n), which is roughly speaking the 
functional inverse of A(i, j), is a very slowly growing func- 
tion of its arguments. So slowly growing, in fact, that for 
all practical purposes it is a constant. In particular, note 
that a(m, n) is maximized by setting to = n, in which 
case if log 2 n < A(z, 4) then a(m, n) < a(n, n) < z. Set- 
ting z = 3 and making use of Eq. (@) , we find that 
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FIG. 5. Total number of steps taken through trees dur- 
ing a single run of the our algorithm as a function of system 
size. Each point is averaged over 1000 runs. Statistical errors 
are much smaller than the data points. The solid line is a 
straight-line fit to the last five points, and gives a slope of 
1.006 ±0.011. 



4(3,4) 



65 536 twos. 



(6) 



This number is ludicrously large. For example, with only 
the first 5 out of the 65 536 twos, we have 



2.0 x 10 



19728 



(7) 



This number is far, far larger than the number of atoms 
in the known universe. Indeed, a generous estimate of 
the number of Planck volumes in the universe would put 
it at only around 10 200 . So it is entirely safe to say that 
log 2 n will never exceed >1(3, 4), bearing in mind that n in 
our case is equal to N, the number of sites on the lattice. 
Hence u(m,n) < 3 for all practical values of m and n 
and certainly for all lattices which fit within our universe. 
Thus the average number of steps needed to traverse the 
tree is at least one and at most 3fc, where k is an unknown 
constant. The value of k is found experimentally to be 
of order 1. 

What does this mean for our percolation algorithm? 
It means that the time taken by both the union and find 
steps is 0(1) in the system size, and hence that each 
bond added to the lattice takes time 0(1). Thus it takes 
time O(M) to add all M bonds, while maintaining an 
up-to-date list of the clusters at all times. (To be pre- 
cise, the time to add all M bonds is bounded above by 
0(Ma(2M,N)), where a(2M, N) is not expected to ex- 
ceed 3 this side of kingdom come.) 

In Fig. U we show the actual total number of steps 
taken in traversing trees during the entire course of a 
run of the algorithm averaged over several runs for each 



algorithm 


time in seconds 


depth-first search 


4 500 000 


unweighted relabeling 


16 000 


weighted relabeling 


4.2 


tree-based algorithm 


2.9 



TABLE I. Time in seconds for a single run of each of the 
algorithms discussed in this paper, for bond percolation on a 
square lattice of 1000 x 1000 sites. 



data point, as a function of system size. A straight-line 
fit to the data shows that the algorithm runs in time pro- 
portional to N a , with a — 1.006 ±0.011. (In actual fact, 
if one measures the performance of the algorithm in real 
( "wallclock" ) time, it will on most computers (circa 2001) 
not be precisely linear in system size because of the in- 
creasing incidence of cache misses as N becomes large. 
This however is a hardware failing, rather than a prob- 
lem with the algorithm.) 

We illustrate the comparative superiority of perfor- 
mance of our algorithm in Table [| with actual timing 
results for it and the three other algorithms described in 
this paper, for square lattices of 1000 x 1000 sites, a typi- 
cal size for numerical studies of percolation. All programs 
were compiled using the same compiler and run on the 
same computer. As the table shows, our algorithm, in 
addition to being the simplest to program, is also easily 
the fastest. In particular we find it to be about 2 mil- 
lion times faster than the simple depth-first search p7| , 
and about 50% faster than the best of our relabeling al- 

For 



gorithms, the weighted relabeling of Section II A 
larger systems the difference will be greater still. 

Site percolation can be handled by only a very slight 
modification of the algorithm. In this case, sites are oc- 
cupied in random order on the lattice, and each site oc- 
cupied is declared to be a new cluster of size 1. Then 
one adds, one by one, all bonds between this site and the 
occupied sites, if any, adjacent to it, using the algorithm 
described above. The same performance arguments ap- 
ply: generation of a permutation of the sites takes time 
O(N), and the adding of all the bonds takes time O(M), 
and hence the entire algorithm takes time 0(M ± N). 
On a regular lattice, for which M — \zN , where z is the 
coordination number, this is equivalent to 0(N). 

It is worth also mentioning the memory requirements 
of our algorithm. In its simplest form, the algorithm re- 
quires two arrays of size N for its operation; one is used 
to store the order in which sites will be occupied (for 
bond percolation this array is size M), and one to store 
the pointers and cluster sizes. The standard depth-first 
search also requires two arrays of size N (a cluster label 
array and a stack), as does the weighted relabeling al- 
gorithm (a label array and a list of cluster sizes). Thus 
all the algorithms are competitive in terms of memory 
use. (The well-known Hoshen-Kopelman algorithm |B8) , 



7 



which is a variation on depth-first search and runs in 
0(N 2 logN) time, has significantly lower memory re- 
quirements, of order N 1 ~ 1 / d , which makes it useful for 
studies of particularly large low-dimensional systems.) 




D. Implementation 

In Appendix A we give a complete computer program 
for our algorithm for site percolation on the square lattice 
written in C. The program is also available for download 
on the Internet |3^]. As the reader will see, the imple- 
mentation of the algorithm itself is quite simple, taking 
only a few lines of code. In this section, we make a few 
points about the algorithm's implementation, which may 
be helpful to those wishing to make use of it. 

First, we note that the "find" operation in which a tree 
is traversed to find its root and the path compression 
can be conveniently performed together using a recursive 
function which in pseudocode would look like this: 

function find(i: integer): integer 

if ptr[i] < return i 

ptr[i] :— find(ptr[i]) 

return ptr[i] 
end function 

This function takes an integer argument, which is the 
label of a site, and returns the label of the root site of 
the cluster to which that site belongs. The pointers are 
stored in an array ptr, which is negative for all root nodes 
and contains the label of the site pointed to otherwise. A 
version of this function in C is included in Appendix A. A 
non-recursive implementation of the function is of course 
possible (as is the case with all uses of recursion) , and we 
give two such implementations in the appendix for those 
who prefer this approach. However, both are somewhat 
more complicated than the recursive version, and with 
the benefit of a modern optimizing compiler are found to 
offer little speed advantage. 

The program given in Appendix A measures the largest 
cluster size as a function of number of occupied sites. 
Above the percolation transition in the limit of large sys- 
tem size this quantity is equal to the size of the giant 
component, and it also has interesting scaling properties 
below the transition |jc|| . However, there a many other 
quantities one might want to measure using our algo- 
rithm. Doing so usually involves keeping track of some 
additional variables as the algorithm runs. Here are three 
examples. 

1. Average cluster size: In order to measure the 
average number of sites per cluster in site percola- 
tion it is sufficient to keep track of only the total 
number c of clusters on the lattice. Then the av- 
erage cluster size is given by n/c, where n is the 
number of occupied sites. To keep track of c, we 
simply set it equal to zero initially, increase it by 
one every time a site is occupied, and decrease it 




FIG. 6. Initial configuration of occupied (gray) and empty 
(white) sites for checking for the presence of a spanning clus- 
ter. In this example there are no periodic boundary condi- 
tions on the lattice. As the empty sites are filled up during 
the course of the run, the two sites marked x will have the 
same cluster root site if and only if there is a spanning cluster 
on the lattice. 



by one every time we perform a union operation. 
Similarly, for averages weighted by cluster size one 
can keep track of higher moments of the cluster-size 
distribution. 

2. Cluster spanning: In many calculations one 
would like to detect the onset of percolation in the 
system as sites or bonds are occupied. One way of 
doing this is to look for a cluster of occupied sites 
or bonds which spans the lattice from one side to 
the other. Taking the example of site percolation, 
one can test for such a cluster by starting the lat- 
tice in the configuration shown in Fig. ^| in which 
there are occupied sites along two edges of the lat- 
tice and no periodic boundary conditions. (Notice 
that the lattice is now not square.) Then one pro- 
ceeds to occupy the remaining sites of the lattice 
one by one in our standard fashion. At any point, 
one can check for spanning simply by performing 
"find" operations on each of the two initial clus- 
ters, starting for example at the sites marked x . 
If these two clusters have the same root site then 
spanning has occurred. Otherwise it has not. 

3. Cluster wrapping: An alternative criterion for 
percolation is to use periodic boundary conditions 
and look for a cluster which wraps all the way 
around the lattice. This condition is somewhat 
more difficult to detect than simple spanning, but 
with a little ingenuity it can be done, and the ex- 
tra effort is, for some purposes, worthwhile. An 
example is in the measurement of the position 
p c of the percolation transition. As discussed in 
Section III A , estimates of p c made using a clus- 
ter wrapping condition display significantly smaller 
finite-size errors than estimates made using cluster 
spanning on open systems. 

A clever method for detecting cluster wrapping has 
been employed by Machta et al. pll in simulations 
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The total number of attempts made before we reach p c 
is therefore 
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FIG. 7. Method for detecting cluster wrapping on periodic 
boundary conditions. When a bond is added (a) which joins 
together two sites which belong to the same cluster, it is pos- 
sible, as here, that it causes the cluster to wrap around the 
lattice. To detect this, the displacements to the root site 
of the cluster (shaded) are calculated (arrows). If the dif- 
ference between these displacements is not equal to a single 
lattice spacing, then wrapping has taken place. Conversely if 
the bond (b) where added, the displacements to the root site 
would differ by only a single lattice spacing, indicating that 
wrapping has not taken place. 



of Potts models, and can be adapted to the case 
of percolation in a straightforward manner. We de- 
scribe the method for bond percolation, although it 
is easily applied to site percolation also. We add to 
each site two integer variables giving the x- and y- 
displacements from that site to the site's parent in 
the appropriate tree. When we traverse the tree, 
we sum these displacements along the path tra- 
versed to find the total displacement to the root 
site. (We also update all displacements along the 
path when we carry out the path compression.) 
When an added bond connects together two sites 
which are determined to belong to the same cluster, 
we compare the total displacements to the root site 
for those two sites. If these displacements differ by 
just one lattice spacing, then cluster wrapping has 
not occurred. If they differ by any other amount, 
it has. This method is illustrated in Fig. |7|. 

It is worth noting also that, if one's object is only to 
detect the onset of percolation, then one can halt the al- 
gorithm as soon as percolation is detected. There is no 
need to fill in any more sites or bonds once the perco- 
lation point is reached. This typically saves about 50% 
on the run-time of the algorithm, the critical occupation 
probability p c being of the order of i. A further small 
performance improvement can often be achieved in this 
case by noting that it is no longer necessary to gener- 
ate a complete permutation of the sites (bonds) before 
starting the algorithm. In many cases it is sufficient sim- 
ply to choose sites (bonds) one by one at random as the 
algorithm progresses. Sometimes in doing this one will 
choose a site (bond) which is already occupied, in which 
case one must generate a new random one. The probabil- 
ity that this happens is equal to the fraction p of occupied 
sites (bonds), and hence the average number of attempts 
we must make before we find an empty site is 1/(1 — p). 



N 



dp 

T^p 



= -JVlog(l-p c ), 



(8) 



or — Mlog(l — p c ) for bond percolation. If p c ~ 0.5, for 
example, this means we will have to generate ATog2 ~ 
0.693./V random numbers during the course of the run, 
rather than the N we would have had to generate to make 
a complete permutation of the sites. Thus it should be 
quicker not to generate the complete permutation. Only 
once p c becomes large enough that — log(l — p c ) ^ 1 
does it start to become profitable to calculate the entire 
permutation, i.e., whenp c > 1 — l/e~ 0.632. If one were 
to use a random selection scheme for calculations over 
the whole range < p < 1, then the algorithm would 
take time 



N 



1-1/ N 



dp 
~ ~P 



TV log A 



(9) 



to find and occupy all empty sites, which means overall 
operation would be O(AlogA) not O(N), so generat- 
ing the permutation is crucial in this case to ensure that 
running time is 0(A). 

One further slightly tricky point in the implementa- 
tion of our scheme is the performance of the convolution, 
Eq. (||) , of the results of the algorithm with the binomial 
distribution. Since the number of sites or bonds on the 
lattice can easily be a million or more, direct evaluation of 
the binomial coefficients using factorials is not possible. 
And for high-precis ion studies, such as the calculations 
presented in Section III , a Gaussian approximation to the 
binomial is not sufficiently accurate. Instead, therefore, 
we recommend the followingmethod of evaluation. The 
binomial distribution, Eq. (|lj), has its largest value for 
given A and p when n = n max = pN . We arbitrarily set 
this value to 1. (We will fix it in a moment.) Now we 
calculate B(N, n,p) iteratively for all other n from 



B(N,n,p) 



B(N,n-l,p)^±l^- p 
B(N,n + l,p)^^ 



Tl J> Ti max 
71 <. TL mayi . 



(10) 



Then we calculate the normalization coefficient C = 
^2 n B(N, n,p) and divide all the B(N, n,p) by it, to cor- 
rectly normalize the distribution. 



III. APPLICATIONS 

In this section we consider a number of applications 
of our algorithm to open problems in percolation theory. 
Not all of the results given here are new — some have ap- 
peared previously in Refs. |l^] and p8fl . They are gath- 
ered here to give an idea of the types of problems for 
which our algorithm is an appropriate tool. 
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FIG. 8. Two topologically distinct ways in which a percola- 
tion cluster can wrap around both axes of a two-dimensional 
periodic lattice. 

A. Measurement of the position of the percolation 
transition 

Our algorithm is well suited to the measurement of 
critical properties of percolation systems, such as position 
of the percolation transition and critical exponents. Our 
first example application is the calculation of the position 
p c of the percolation threshold for site percolation on the 
square lattice, a quantity for which we currently have no 
exact result. 

There are a large number of possible methods for de- 
termining the position of the percolation threshold nu- 
merically on a regular lattice different methods 
being appropriate with different algorithms. As discussed 
in Ref. p^j, our algorithm makes possible the use of a 
particularly attractive method based on lattice wrapping 
probabilities, which has substantially smaller finite-size 
scaling corrections than methods employed previously. 

We define Rl(p) to be the probability that for site 
occupation probability p there exists a contiguous cluster 
of occupied sites which wraps completely around a square 
lattice of I x L sit es with periodic boundary conditions. 
As in Section II D, cluster wrapping is here taken to be 



an estimator of the presence or absence of percolation on 
the infinite lattice. There are a variety of possible ways in 
which cluster wrapping can occur, giving rise to a variety 
of different definitions for R^: 

• R { £ ] and are the probabilities that there exists 
a cluster which wraps around the boundary condi- 
tions in the horizontal and vertical directions re- 
spectively. Clearly for square systems these two 
are equal. In the rest of this paper we refer only 

toi#>. 

• i?^ is the probability that there exists a cluster 
which wraps around either the horizontal or vertical 
directions, or both. 

• r£ is the probability that there exists a cluster 
which wraps around both horizontal and vertical 
directions. Notice that there are many topologi- 
cally distinct ways in which this can occur. Two of 



them, the "cross" configuration and the "single spi- 
ral" configuration, are illustrated in Fig. ||. R^' is 
defined to include both of these and all other pos- 
sible ways of wrapping around both axes. 

• R^ is the probability that a cluster exists which 
wraps around one specified axis but not the other 
axis. As with R^\ it does not matter, for the 
square systems studied here, which axis we specify. 

These four wrapping probabilities satisfy the equalities 



4 e) = R.W 



R (v) _ R (b) = 2R (h) 
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L ' 
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(i) _ P (M 



?(*>) _ p(e) 



R v>, _ R w = R y L 



4 h) = l(4 e) 



(ii) 
4 6) ). (i 2 ) 



so that only two of them are independent. They also 
satisfy the inequalities R^ } < R^ } < R^ } and R^ } < 

Each run of our algorithm on the L x L square system 
gives one estimate of each of these functions for the com- 
plete range of p. It is a crude estimate however: since 
an appropriate wrapping cluster either exists or doesn't 
for all values of p, the corresponding R^ (p) in the mi- 
crocanonical ensemble is simply a step function, except 
for i?£ , which has two steps, one up and one down. All 
four functions can be calculated in the microcanonical en- 
semble by finding the numbers n£ and ni ^ of occupied 
sites at which clusters first appear which wrap horizon- 
tally and vertically. (This means that, as discussed in 
Section II D , the algorithm can be halted once wrapping 
in both directions has occurred, which for the square lat- 
tice gives a saving of about 40% in the amount of CPU 
time used.) 

Our estimates of the four functions are improved by 
averaging over many runs of the algorithm, and the re- 
sults are then convolved with the binomial distribution, 
Eq. (||), to give smooth curves of Rl{p) in the canonical 
ensemble. (Alternatively one can perform the convolu- 
tion first and average over runs second; both are linear 
operations, so order is unimportant. However, the con- 
volution is quite a lengthy computation, so it is sensible 
to choose the order that requires it to be performed only 
once.) 

In Fig. H we show results from calculations of Rl(p) 
using our algorithm for the four different definitions, for 
systems of a variety of sizes, in the vicinity of the perco- 
lation transition. 

The reason for our interest in the wrapping probabili- 
ties is that their values can be calculated exactly. Exact 
expressions can be deduced from the work of Pinson [45]] 
and written directly in terms of the Jacobi ^-function 
f?3 (g) and the Dedekind ?y-function rj{q) . Pinson calcu- 
lated the probability, which he denoted w(Z x Z), of the 
occurrence on a square lattice of a wrapping cluster with 
the "cross" topology shown in the left panel of Fig. ||. 
By duality, our probability R^ (p c ) is just 1 — ir(Z x Z), 
since if there is no wrapping around either axis, then 
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FIG. 9. Plots of the cluster wrapping probability functions 
Rl(p) for L — 32, 64, 128 and 256 in the region of the per- 
colation transition for percolation (a) along a specified axis, 
(b) along either axis, (c) along both axes, and (d) along one 
axis but not the other. Note that (d) has a vertical scale 
different from the other frames. The dotted lines denote the 
expected values of p c and Roo{Pc). 



there must necessarily be a cross configuration on the 
dual lattice. This yields (HHl 

Rg(p c ) = l- 
#3(6-3^/8)^3(6-8^/3) _ tf 3 (e-3T/ 2 )tf3(e- 27r /3) 



2[r]{e- 2 ")} 2 



(13) 



The probability (p c ) for wrapping around exactly one 
axis is equal to the quantity denoted 7r(l,0) by Pinson, 
which for a square lattice can be written 



3^ 3 (e- 6 ") + z? 3 (e- 2 " /3 ) - 4^ 3 (e~ 8 " /3 ) 



(14) 



The remaining two probabilities R^ and i?S can now 
calculated from Eq. ([l2|). To ten figures, the resulting 
values for all four are: 

R£>(pc) = 0.521058290, R^(p c ) = 0.690473725, 
R^fa) = 0.351642855, Rg}(p c ) = 0.169415435. (15) 

If we calculate the solution p of the equation 

Rl(p) = Rco(pc), (16) 

we must have p — > p c as L — > 00, and hence this solution 
is an estimator for p c . Furthermore, it is a very good 
estimator, as we now demonstrate. 

In Fig. |l0|, we show numerical results for the finite-size 
convergence of Rl(p c ) to the L = 00 values for both site 



FIG. 10. Convergence with increasing system size of 
Rl(Pc) to its known value at L — 00 for (circles) and 

R^ (squares) under (a) site percolation and (b) bond per- 
colation. Filled symbols in panel (a) are exact enumeration 
results for system sizes L — 2 ... 6. The dotted lines show the 
expected slope if, as we conjecture, Rl{Pc) converges as L~ 2 
with increasing L. The data for R^ in panel (b) have been 
displaced upwards by a factor of two for clarity. 



and bond percolation. (For site percolation we used the 
current best-known value for p c from Ref. |^8| .) Note 
that the expected statistical error on Rl(p c ) is known 
analytically to good accuracy, since each run of the al- 
gorithm gives an independent estimate of Rl which is 
either 1 or depending on whether or not wrapping has 
occurred in the appropriate fashion when a fraction p c of 
the sites are occupied. Thus our estimate of the mean 
of Rl{Pc) over n runs is drawn from a simple binomial 
distribution which has standard deviation 



Rl(Pc)[1-Rl(Pc)} 



(17) 



If we approximate Rl(p c ) by the known value of Roo(Pc), 
then we can evaluate this expression for any n. 

As the figure shows, the finite-size corrections to Rl 
decay approximately as L~ 2 with increasing system size. 
For example, fits to the data for R L 1 ' (for which our 
numerical results are cleanest), give slopes of —1.95(17) 
(site percolation) and —2.003(5) (bond percolation). On 
the basis of these results we conjecture that Rl(p c ) con- 
verges to its L = 00 value exactly as L~ 2 . 

At the same time, the width of the critical region is 
decreasing as L~ l / V , so that the gradient of Rl{p) in the 
critical region goes as L 1 ^ . Thus our estimate p of the 
critical occupation probability from Eq. (|n3|) converges 
to p c according to 



P-Pc 



L -2-l/u =£ -H/4 



(18) 
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where the last equality makes use of the known value 
v — | for percolation on the square lattice. This con- 
vergence is substantially better than that for any other 
known estimator of p c . The best previously known con- 
vergence was p— p c ~ L~ 1 ~ 1 / u for certain estimates based 
upon crossing probabilities in open systems, while many 
other estimates, including the rcnormalization-group es- 
timate, converge as p — p c ~ L~ x l v Q]. It implies that 
we should be able to derive accurate results for p c from 
simulations on quite small lattices. Indeed we expect that 
statistical error will overwhelm the finite size corrections 
at quite small system sizes, making larger lattices not 
only unnecessary, but also essentially worthless. 

Statistical errors for the calculation are estimated in 
conventional fashion. From Eq. ( fl7|) we know that the 
error on the mean of Rl{p) over n runs of the algo- 
rithm goes as -nT 1 / 2 , independent of L, which implies 
that our estimate of p c carries an error a Pr scaling ac- 
cording to a Pc ~ wr x l' 1 L~ x l 1 ' . With each run taking 
time O(JV) = 0(L d ), the total time T ~ nL d taken for 
the calculation is related to <r Pc according to 



L d/2-i/» L i/i 



T 



T 



(19) 



where the last equality holds in two dimensions. Thus in 
order to make the statistical errors on systems of different 
size the same, we should spend an amount of time which 
scales as T& ~ L d ~ 2 ' v on systems of size L, or Tl ~ VL 
in two dimensions. 

In Fig. [ll] we show the results of a finite-size scal- 
ing calculation of this type for p c . The four different 
definitions of Rl give four (non-independent) estimates 
of p c : 0.59274621(13) for R [ L h \ 0.59274636(14) for R L e) , 

0.59274606(15) for R { L h) , and 0.59274629(20) for R { L 1] . 
The first of these is the best, and is indeed the most 
accurate current estimate of p c for site percolation on 
the square lattice. This calculation involved the simula- 
tion of more than 7 x 10 9 separate samples, about half of 
which were for systems of size 128 x 128. 

The wrapping probability R^ is of particular interest, 
because one does not in fact need to know its value for 
the infinite system in order to use it to estimate p c . Since 
this function is non-monotonic it may intercept its L = oo 
value at two different values of p, but its maximum must 
necessarily converge to p c at least as fast as either of these 
intercepts. And the position of the maximum can be 
determined without knowledge of the value of r2d(p c )- 
In fact, in the calculation shown in Fig. [Tl], we used the 

maximum of R^ to estimate p c and not the intercept, 



since in this case R L was strictly lower than R^,' for all 
p, so that there were no intercepts. 

The criterion of a maximum in R L ^ can be used to find 
the percolation threshold in other systems for which ex- 
act results for the wrapping probabilities are not known. 
As an example, we show in the inset of Fig. [ll] a finite- 
size scaling calculation of p c for three-dimensional perco- 
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FIG. 11. Finite size scaling of estimate for p c on square 
lattices of L x L sites using measured probabilities of clus- 
ter wrapping along one axis (circles), either axis (squares), 
both axes (upward-pointing triangles), and one axis but not 
the other (downward-pointing triangles). Inset: results of a 
similar calculation for cubic lattices of L x L x L sites using 
the probabilities of cluster wrapping along one axis but not 
the other two (circles), and two axes but not the other one 
(squares). 



lation on the cubic lattice (with periodic boundary condi- 
tions in all directions) using this measure. Here there are 
two possible generalizations of our wrapping probability: 

R i/ (p) is the probability that wrapping occurs along one 

axis and not the other two, and R L (p) is the probability 
that wrapping occurs along two axes and not the third. 
We have calculated both in this case. 

Although neither the exact value of v nor the expected 
scaling of this estimate of p c is known for the three- 
dimensional case, we can estimate p c by varying the scal- 
ing exponent until an approximately straight line is pro- 
duced. This procedure reveals that our estimate of p c 
scales approximately as L~ 2 in three dimensions, and we 
derive estimates of p c — 0.3097(3) and 0.3105(2) for the 
position of the transition. Combining these results we 
estimate that p c — 0.3101(10), which is in reasonable 
agreement with the best known result for this quantity 
of 0.3116080(4) HH). (Only a short run was performed 
to obtain our result; better results could presumably be 
derived with greater expenditure of CPU time.) 



B. Scaling of the wrapping probability functions 



It has been proposed |5fj,|51|] that below the percolation 
transition the probability of finding a cluster which spans 
the system, or wraps around it in the case of periodic 
boundary conditions, should scale according to 



R L (p) ~ exp(-L/0. 



(20) 
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FIG. 12. Demonstration of the stretched exponential be- 
havior, Eq. (pl|), in simulation results for 1024 x 1024 square 
systems. The results are averaged over 1 000 000 runs of the 
algorithm. The curves are for (top to bottom) v — 1.2, 1.3, 
1.4, 1.5, 1.6, and 1.7. Inset: the same data replotted loga- 
rithmically to show the stretched exponential behavior. The 
dotted line indicates the expected slope of | . 



The probability of wrapping around the system is equal 
to the trace of the product of the transfer matrices 
for the L rows of the system. With periodic bound- 
ary conditions, the transfer matrices are equal for all 
rows, and the wrapping probability is thus a simple sum 
of the Lth powers of the eigenvalues A.; of the individ- 
ual transfer matrices: Rl(p) = J2i^f- For large L, 
this sum is dominated by the largest eigenvalue Ao and 
Rl{p) = Aq = exp(LlogAo). Comparing with Eq. (20), 
we conclude that the leading constant in ( pfj| ) must tend 
to unity as L becomes large, and thus 



R L (p) = exp(-L/£) = cxp[-cL(p c - p) v ], 



(21) 



where c is another constant. In other words, as a func- 
tion of p c ~ p, the wrapping probability should follow a 
stretched exponential with exponent v and a leading con- 
stant of 1. This contrasts with previous conjectures that 
Rl(p) has Gaussian tails [Qj5|j5^]. 

The behavior (j2l]) is only seen when the correlation 
length is substantially smaller than the system dimen- 
sion, but also greater than the lattice spacing, i.e., 1 <C 
£ <C L. This means that in order to observe it clearly 
we need to perform simulations on reasonably large sys- 
tems. In Fig. [l^ we show results for site percolation on 
square lattices of 1024 x 1024 sites, with log iij, plot- 
ted against (p c — p) u for various values of v, to look for 
straight-line behavior. Interpretation of the results is a 
little difficult, since one must discount curvature close to 
the origin where £ > L. However, the best straight line 
seems to occur in the region of v = 1.4 ± 0.1, in agree- 



ment with the expected v = | , while strongly ruling out 
the Gaussian behavior. 

A better demonstration of this result is shown in the 
inset of the figure. Here we plot log(— log(i?i)) as a func- 
tion of log(p c — p), which, since the leading constant in 
Eq. (|2lJ) is equal to unity, should give a straight line with 
slope ;| in the regime where 1 < ^ « L. This behavior 
is clearly visible in the figure. Note that this kind of plot 
is only valid for periodic boundary conditions, since the 
leading constant in Eq. ( p0| ) is not in general equal to 1 
in other cases. 



C. Percolation on random graphs 

For our last example, we demonstrate the use of our 
algorithm on a system which is not built upon a regu- 
lar lattice. The calculations of this section are instead 
performed on random graphs, i.e., collections of sites 
(or "vertices" ) with random bonds (or "edges" ) between 
them. 

Percolation can be considered a simple model for the 
robustness of networks Jlj],[l8| • In a communications net- 
work, messages are routed from source to destination by 
passing them from one vertex to another through the 
network. In the simplest approximation, the network 
between a given source and destination is functional so 
long as a single path exists from source to destination, 
and non-functional otherwise. In real communications 
networks such as the Internet, a significant fraction of 
the vertices (routers in the case of the Internet) are non- 
functional at all times, and yet the majority of the net- 
work continues to function because there are many pos- 
sible paths from each source to each destination, and it is 
rare that all such paths are simultaneously interrupted. 
The question therefore arises: what fraction of vertices 
must be non-functional before communications are sub- 
stantially disrupted? This question may be rephrased 
as a site percolation problem in which occupied vertices 
represent functional routers and unoccupied vertices non- 
functional ones. So long as there is a giant component of 
connected occupied vertices (the equivalent of a perco- 
lating cluster on a regular lattice) then long-range com- 
munication will be possible on the network. Below the 
percolation transition, where the giant component disap- 
pears, the network will fragment into disconnected clus- 
ters. Thus the percolation transition represents the point 
at which the network as a whole becomes non-functional, 
and the size of the giant component, if there is one, repre- 
sents the fraction of the network which can communicate 
effectively. 

Both the position of the phase transition and the size 
of the giant component have been calculated exactly by 
Callaway et al. [|18| for random graphs with arbitrary de- 
gree distributions. The degree A; of a vertex in a network 
is the number of other vertices to which it is connected. 
If pk is the probability that a vertex has degree k and q 
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FIG. 13. Simulation results (points) for site percolation on 
random graphs with degree distribution given by Eq. (pi|), 
with r = 2.5 and three different values of n. The solid lines 
are the exact solution of Callaway et al. jl^] , Eq. (^3|) , for the 
same parameter values. 



three different values of the cutoff parameter k, along 
with the exact solution derived by numerical iteration of 
Eq. (^3|). As the figure shows, the two are in excellent 
agreement. The simulations for this figure took about an 
hour in total. We would expect a simulation performed 
using the standard depth-first search and giving results 
of similar accuracy to take about a million times as long, 
or about a century. 

A number of authors |ll||5(||37j have examined the re- 
silience of networks to the selective removal of the vertices 
with highest degree. This scenario can also be simulated 
efficiently using our algorithm. The only modification 
necessary is that the vertices are now occupied in order 
of increasing degree, rather than in random order as in 
the previous case. We note however that the average 
time taken to sort the vertices in order of increasing de- 
gree scales as 0(N\ogN) when using standard sorting 
algorithms such as quicksort pH , and hence this calcu- 
lation is dominated by the time to perform the sort for 
large TV, making overall running time 0(AHog N) rather 
than O(N). 

IV. CONCLUSIONS 



is the occupation probability for vertices, then the per- 
colation transition takes place at 



E^Lo fc ( fc - l )Pk 

k=0 k Pk 



(22) 



and the fraction S of the graph filled by the giant com- 
ponent is the solution of 



q-q 2_^P kU 

k=0 



Efclo k Pk 



fc-1 



(23) 

For the Internet, the degree distribution has been found 
to be power-law in form |5J] , though in practice the power 
law must have some cutoff at finite k. Thus a reasonable 
form for the degree distribution is 



p k = Ck r e 



-k/K 



for k > 1. 



(24) 



The exponent r is found to take values between 2.1 and 
2.5 depending on the epoch in which the measurement 
was made and whether one looks at the network at the 
router level or at the coarser domain level. Vertices with 
degree zero are excluded from the graph since a vertex 
with degree zero is necessarily not a functional part of 
the network. 

We can generate a random graph of N vertices with 
this degree distribution in O(N) time using the prescrip- 
tion given in Ref. |55| and then use our percolation al- 
gorithm to calculate, for example, the size of the largest 
cluster for all values of q. In Fig. |l3| we show the results 
of such a calculation for N = 1 000 000 and r = 2.5 for 



We have described in detail a new algorithm for study- 
ing site or bond percolation on any lattice which can 
calculate the value of an observable quantity for all val- 
ues of the site or bond occupation probability from zero 
to one in time which, for all practical purposes, scales 
linearly with lattice volume. We have presented a time 
complexity analysis demonstrating this scaling, empiri- 
cal results showing the scaling and comparing running 
time to other algorithms for percolation problems, and 
a description of the details of implementation of the al- 
gorithm. A full working program is presented in the fol- 
lowing appendix and is also available for download. We 
have given three example applications for our algorithm: 
the measurement of the position of the percolation tran- 
sition for site percolation on a square lattice, for which 
we derive the most accurate result yet for this quantity; 
the confirmation of the expected |-power stretched expo- 
nential behavior in the tails of the wrapping probability 
functions for percolating clusters on the square lattice; 
and the calculation of the size of the giant component 
for site percolation on a random graph, which confirms 
the recently published exact solution for the same quan- 
tity. 
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APPENDIX A: PROGRAM 

In this appendix we give a complete program in C for 
our algorithm for site percolation on a square lattice of 
N = L x L sites with periodic boundary conditions. This 
program prints out the size of the largest cluster on the 
lattice as a function of number of occupied sites n for 
values of n from 1 to N . The entire program consists of 
73 lines of code. 

First we set up some constants and global variables: 

#include <stdlib.h> 
#include <stdio.h> 



#define L 128 
#define N (L*L) 
#define EMPTY (-N-1) 

int ptr [N] ; 
int nn [N] [4] ; 
int order [N] ; 



/* Linear dimension */ 



/* Array of pointers */ 
/* Nearest neighbors */ 
/* Occupation order */ 



Sites are indexed with a single signed integer label for 
speed, taking values from to N — 1. Note that on com- 
puters which represent integers in 32 bits, this program 
can, for this reason, only be used for lattices of up to 
2 31 ~ 2 billion sites. While this is adequate for most pur- 
poses, longer labels will be needed if you wish to study 
larger lattices. 

The array ptr [] serves triple duty: for non-root oc- 
cupied sites it contains the label for the site's parent in 
the tree (the "pointer"); root sites are recognized by a 
negative value of ptr [] , and that value is equal to minus 
the size of the cluster; for unoccupied sites ptr [] takes 
the value EMPTY. 

Next we set up the array nn [] [] which contains a list 
of the nearest neighbors of each site. Only this array 
need be changed in order for the program to work with 
a lattice of different topology. 

void boundaries () 
{ 

int i ; 

for (i=0; i<N; i++) { 
nn[i] [0] = (i+l)7.N; 
nn[i] [1] = (i+N-iy/.N; 
nn[i] [2] = (i+L)7„N; 
nn[i] [3] = (i+N-L)°/.N; 
if (i°/„L==0) nn[i] [1] = i+L-1; 
if ((i+l)7.L==0) nn[i][0] = i-L+1; 

} 

} 

Now we generate the random order in which the sites will 
be occupied, by randomly permuting the integers from 
to N - 1: 

void permutation () 



int i , j ; 
int temp; 

for (i=0; i<N; i++) order[i] = i; 
for (i=0; i<N; i++) { 

j = i + (N-i)*drand() ; 

temp = order [i] ; 

order [i] = order [j]; 

order [j] = temp; 

> 



Here the function drandO generates a random double 
precision floating point number between and 1. Many 
people will have such a function already to hand. For 
those who don't, a suitable one is supplied with Ref. fl39| . 

We also define a function which performs the "find" 
operation, returning the label of the root site of a clus- 
ter, as well as path compression. The version we use is 



recursive, as described in Section II D 



int f indroot (int i) 
{ 

if (ptr[i]<0) return i; 

return ptr[i] = f indroot (ptr [i] ) ; 

} 

The code to perform the actual algorithm is quite brief. 
It works exactly as described in the text. Sites are oc- 
cupied in the order specified by the array order [] . The 
function f indroot () is called to find the roots of each 
of the adjacent sites. If amalgamation is needed, it is 
performed in a weighted fashion, smaller clusters being 
added to larger (bearing in mind that the value of ptr [] 
for the root nodes is minus the size of the corresponding 
cluster) . 

void percolate () 
{ 

int i , j ; 
int sl,s2; 
int rl,r2; 
int big=0; 

for (i=0; i<N; i++) ptr[i] = EMPTY; 
for (i=0; i<N; i++) { 
rl = si = order [i] ; 
ptr [si] = -1; 
for (j=0; j<4; j++) { 
s2 = nn[sl] [j] ; 
if (ptr [s2] ! =EMPTY) { 
r2 = f indroot (s2) ; 
if (r2!=rl) { 

if (ptr[rl]>ptr[r2]) { 
ptr [r2] += ptr [rl] ; 
ptr[rl] = r2; 
rl = r2; 
} else { 
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ptr [rl] += ptr [r2] ; 
ptr[r2] = rl; 

} 

if (-ptr[rl]>big) big = -ptr[rl] ; 

> 

} 

} 

printfO'y.i 7.i\n",i+l,big); 

> 

} 

The main program is now simple: 

mainO 
{ 

boundaries () ; 
permutationO ; 
percolateO ; 

} 

A complete working version of this program can also be 
downloaded from the Internet ||{| . 

While our recursive implementation of the function 
findrootO is concise, some readers may find it unsat- 
isfactory, either because they are using a compiler under 
which recursive code runs slowly, or because they want 
to translate the program into another language, such as 
Fortran 77, which does not support recursion. For their 
benefit we give here two alternative implementations of 
this function, neither of which makes use of recursion. 
The first of these is a straightforward implementation 
combining the find operation with path compression, as 
before, but using an explicit stack: 

#define STACKSIZE 100 

int findroot (int i) 
{ 

int r; 
int sp=0 ; 

int stack [STACKSIZE] ; 
r = i; 

while (ptr[r]>=0) { 
stack [sp++] = r; 
r = ptr [r] ; 

> 

while (sp) ptr [stack [ — sp]] = r; 
return r; 

} 

The stack used is small, having just 100 elements. This 
should be more than sufficient in almost all cases, since 
the average distance traversed across the tree is only 
about 3. 

A more elegant way to implement findrootO with- 
out recursion is to modify the union/find algorithm it- 
self slightly. There is, it turns out, another union/find 
algorithm which runs in O(N) time. In this algorithm 
the union operation is as before, but the find operation 



now involves "path halving" instead of path compres- 
sion. With path halving, each pointer along the path 
traversed is changed to point to its "grandparent" in the 
tree, which effectively halves the length of the path from 
a site to the root of the cluster each time findroot () is 
called. Tarjan |58| has shown that this find operation also 
runs asymptotically in very nearly constant time, giving 
an algorithm which runs in linear time overall. Here is 
a version of the function findrootO which implements 
path halving: 

int findroot (int i) 
{ 

int r , s ; 

r = s = i ; 

while (ptr[r]>=0) { 

ptr [s] = ptr [r] ; 

s = r; 

r = ptr [r] ; 

} 

return r; 

} 



[1] D. Stauffer and A. Aharony, Introduction to Percolation 
Theory, 2nd edition, Taylor and Francis, London (1992). 

[2] P. G. de Gennes and E. Guyon, J. de Mecanique 3, 403 
(1978). 

[3] R. G. Larson, L. E. Scriven, and H. T. Davis, Chem. Eng. 

Set. 15, 57 (1981). 
[4] M. Sahimi, J. Physique I 4, 1263 (1994). 
[5] T. Odagaki and S. Toyofuku, J. Phys. CM 10, 6447 

(1998). 

[6] J. Tobochnik, Phys. Rev. E 60, 7137 (1999). 

[7] S. de Bondt, L. Froyen, and A. Deruyttere, J. Mater. Sci. 

27, 1983 (1992). 
[8] A. Bunde, S. Havlin, and M. Porto, Phys. Rev. Lett. 74, 

2714 (1995). 

[9] D. P Bentz and E. J. Garboczi, Materials and Structures 
25, 523 (1992). 
[10] J. Machta, Phys. Rev. Lett. 66, 169 (1991). 
[11] K. Moon and S. M. Girvin, Phys. Rev. Lett. 75, 1328 
(1995). 

[12] L. de Arcangelis, S. Redner, A. Coniglio, Phys. Rev. B 

31, 4725 (1985). 
[13] C. L. Henley, Phys. Rev. Lett. 71, 2741 (1993). 
[14] K. A. With and T. O. Crist, Ecology 76, 2446 (1995). 
[15] M. E. J. Newman and D. J. Watts, Phys. Rev. E 60, 

7332 (1999). 

[16] C. Moore and M. E. J. Newman, Phys. Rev. E 62, 7059 
(2000). 

[17] R. Cohen, K. Erez, D. ben-Avraham, and S. Havlin, 
Phys. Rev. Lett. 85, 4626 (2000). 



1G 



[18] D. S. Callaway, M. E. J. Newman, S. H. Strogatz, and 
D. J. Watts, Phys. Rev. Lett. 85, 5468 (2000). 

[19] T. S. Ray and N. Jan, Phys. Rev. Lett. 72, 4045 (1994). 

[20] S. Solomon, G. Weisbuch, L. de Arcangelis, N. Jan, and 
D. Stauffer, Phystca A 277, 239 (2000). 

[21] An algorithm does exist, for two dimensional systems 
only, which will generate the hull of the spanning cluster 
without populating the rest of the lattice. If one is inter- 
ested only in this cluster, then this algorithm is faster. 
In particular, because the system-spanning hull in two 
dimensions is known to occupy a fraction of the lattice 
which scales as iV 7//8 at criticality, the algorithm will run 
in time 0(iV 7/8 ) in the critical region. See R. M. Ziff, P. 
T. Cummings, and G. Stell, J. Phys. A 17, 3009 (1984). 

[22] C.-K. Hu, Phys. Rev. B 46, 6592 (1992). 

[23] H. Gould and J. Tobochnik, An Introduction to Com- 
puter Simulation Methods, 2nd edition, p. 444, Addison- 
Wesley, Reading, MA (1996). 

[24] L. N. Shchur and O A. Vasilyev, cond-mat/0005448 . 

[25] A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett. 
61, 2635 (1988). 

[26] A. M. Ferrenberg, D. P. Landau, and R. H. Swendsen, 
Phys. Rev. E 51, 5092 (1995). 

[27] M. E. J. Newman and R. G. Palmer, J. Stat. Phys. 97, 
1011 (1999). 

[28] M. E. J. Newman and R. M. Ziff, Phys. Rev. Lett. 85, 
4104 (2000). 

[29] C. Moukarzel, Int. J. Mod. Phys. C 9, 887 (1998). 

[30] J. E. de Freitas, L. S. Lucena, and S. Roux, Physica A 

266, 81 (1999). 
[31] R. Sedgewick, Algorithms, 2nd edition, Addison-Wesley, 

Reading, Mass. (1988). 
[32] D. E. Knuth, The Art of Computer Programming, Vol. 1: 

Fundamental Algorithms, 3rd Edition, Addison-Wesley, 

Reading, Mass. (1997). 
[33] B. A. Galler and M. J. Fischer, Comm. ACM 7, 301 

(1964). 

[34] M. J. Fischer, in Complexity of Computer Calculations, 
R. E. Miller and J. W. Thatcher (eds.), Plenum Press, 
New York (1972). 

[35] R. E. Tarjan, J. ACM 22, 215 (1975). 

[36] W. Ackermann, Math. Ann. 99, 118 (1928). 

[37] We did not, as the reader may have guessed, actually run 
the depth-first search algorithm for 4.5 million seconds 
(about 2 months), since this would have been wasteful of 
computing resources. Instead, we ran the algorithm for 
100 representative values of n, the number of occupied 
bonds, and scaled the resulting timing up to estimate 
the run-time for all 1 000 000 values. 

[38] J. Hoshen and R. Kopelman, Phys. Rev. B 14, 3438 
(1976). 

[39] http : //www. santaf e . edu/~mark/per eolation/ 

[40] M. Z. Bazant, Phys. Rev. E 62, 1660 (1999). 

[41] J. Machta, Y. S. Choi, A. Lucke, T. Schweizer and L. M. 

Chayes, Phys. Rev. E 54, 1332 (1996). 
[42] P. J. Reynolds, H. E. Stanley, and W. Klein, J. Phys. A 

11, L199 (1978). 
[43] P. J. Reynolds, H. E. Stanley, and W. Klein, Phys. Rev. 

B 21, 1223 (1980). 
[44] R. M. Ziff, Phys. Rev. Lett. 69, 2670 (1992). 



[45] 
[46] 

[47] 



J. 
D. 



Stat. Phys. 75, 1167 (1994). 

Lorenz, and P. Kleban, Physica A 266, 



[49] 

[50] 
[51] 
[52] 

[53] 
[54] 

[55] 

[56] 

[57] 

[58] 



H. T. Pinson 
R. M. Ziff, C 
17 (1999). 

Sometimes (for example, in the symbolic manipulation 
program Mathematica) the 77-function is defined in com- 
plex argument form. If we were to use this definition, the 
^-functions in the denominators of Eqs. ( |l3"| ) and (|l~ 
would become r](i). 
C. D. Lorenz and R. M. Ziff, J. Phys. A 31, 8147 (1998) 
H. G. Ballesteros, L. A. Fernandez, V. Martin-Mayor, 
A. Mufioz Sudupe, G. Parisi, and J. J. Ruiz-Lorenzo, J. 
Phys. A 32, 1 (1999). 

L. Berlyand and J. Wehr, J. Phys. A 28, 7127 (1995). 
J.-P. Hovi and A. Aharony, Phys. Rev. E 53, 235 (1996). 
M. E. Levinshtein, B. I. Shklovskii, M. S. Shur, and A. 
L. Efros, Sov. Phys. JETP 42, 197 (1976). 
F. Wester, Int. J. Mod. Phys. C 11, 843 (2000). 
M. Faloutsos, P. Faloutsos, and C. Faloutsos 
Comm. Rev. 29, 251 (1999). 
M. E. J. Newman, S. H. Strogatz, and D. J 



Comp. 



Watts, 



cond-mat/0007235 



R. Albert, H. Jeong, and A.-L. Barabasi, Nature 406, 
378 (2000). 

A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Ra- 
jagopalan, R. Stata, A. Tomkins, and J. Wiener, Com- 
puter Networks 33, 309 (2000). 

R. E. Tarjan, Data Structures and Network Algo- 
rithms, Society for Industrial and Applied Mathematics, 
Philadelphia (1983). 



17 



